Improvement of Jarvis-Patrick Clustering Based on Fuzzy Similarity

نویسندگان

  • Ágnes Vathy-Fogarassy
  • Attila Kiss
  • János Abonyi
چکیده

Different clustering algorithms are based on different similarity or distance measures (e.g. Euclidian distance, Minkowsky distance, Jackard coefficient, etc.). Jarvis-Patrick clustering method utilizes the number of the common neighbors of the k-nearest neighbors of objects to disclose the clusters. The main drawback of this algorithm is that its parameters determine a too crisp cutting criterion, hence it is difficult to determine a good parameter set. In this paper we give an extension of the similarity measure of the Jarvis-Patrick algorithm. This extension is carried out in the following two ways: (i) fuzzyfication of one of the parameters, and (ii) spreading of the scope of the other parameter. The suggested fuzzy similarity measure can be applied in various forms, in different clustering and visualization techniques (e.g. hierarchical clustering, MDS, VAT). In this paper we give some application examples to illustrate the efficiency of the use of the proposed fuzzy similarity measure in clustering. These examples show that the proposed fuzzy similarity measure based clustering techniques are able to detect clusters with different sizes, shapes and densities. It is also shown that the outliers are also detectable by the proposed measure.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Clustering of Fuzzy Data Sets Based on Particle Swarm Optimization With Fuzzy Cluster Centers

In current study, a particle swarm clustering method is suggested for clustering triangular fuzzy data. This clustering method can find fuzzy cluster centers in the proposed method, where fuzzy cluster centers contain more points from the corresponding cluster, the higher clustering accuracy. Also, triangular fuzzy numbers are utilized to demonstrate uncertain data. To compare triangular fuzzy ...

متن کامل

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks have led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

متن کامل

Clustering Using a Similarity Measure Based on Shared Near Neighbors

A nonparametric clustering technique incorporating the concept of similarity based on the sharing of near neighbors is presented. In addition to being an essentially paraliel approach, the computational elegance of the method is such that the scheme is applicable to a wide class of practical problems involving large sample size and high dimensionality. No attempt is made to show how a priori pr...

متن کامل

Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

 Classification is an one of the important parts of data mining and knowledge discovery. In most cases, the data that is utilized to used to training the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples but while the number of other class samples is naturally inherently low. In general, the methods of solving this kind of prob...

متن کامل

A Soft Hierarchical Algorithm for the Clustering of Multiple Bioactive Chemical Compounds

Most of the clustering methods used in the clustering of chemical structures such as Ward’s, Group Average, Kmeans and Jarvis-Patrick, are known as hard or crisp as they partition a dataset into strictly disjoint subsets; and thus are not suitable for the clustering of chemical structures exhibiting more than one activity. Although, fuzzy clustering algorithms such as fuzzy cmeans provides an i...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007